1,186 research outputs found
Noise resistant generalized parametric validity index of clustering for gene expression data
This article has been made available through the Brunel Open Access Publishing Fund. Validity indices have been investigated for decades. However, since the literature contains no study of the noise-resistance performance of these indices, there is no guideline for determining the best clustering in noisy data sets, especially microarray data sets. In this paper, we propose a generalized parametric validity (GPV) index which employs two tunable parameters α and β to control the proportions of objects considered when calculating the dissimilarities. The greatest advantage of the proposed GPV index is its noise-resistance ability, which results from the flexibility of tuning the parameters. Several rules are set to guide the selection of parameter values. To illustrate the noise-resistance performance of the proposed index, we use the GPV index to assess five clustering algorithms in two gene expression data simulation models with different noise levels, and compare its ability to determine the number of clusters with that of eight existing indices. We also test the GPV index on three groups of real gene expression data sets. The experimental results suggest that the proposed GPV index has superior noise-resistance ability and provides fairly accurate judgements
Dual-layer network representation exploiting information characterization
In this paper, a logical dual-layer representation approach is proposed to facilitate the analysis of directed and weighted complex networks. Unlike the single logical layer structure, which has been widely used for directed and weighted flow graphs, the proposed approach replaces the single layer with a dual-layer structure that introduces a provider layer and a requester layer. The new structure characterizes the nodes by the information they provide to and request from the network. Its features are explained, and its implementation and visualization are detailed. We also design two clustering methods with different strategies, which support analysis from different points of view. The effectiveness of the proposed approach is demonstrated using a simplified example. Compared with the layout of the conventional directed graph, the new dual-layer representation reveals deeper insight into complex networks and provides more opportunities for versatile clustering analysis. The National Institute for Health Research (NIHR) under its Programme Grants for Applied Research Programme (Grant Reference Number RP-PG-0310-1004)
Yeast gene CMR1/YDL156W is consistently co-expressed with genes participating in DNA-metabolic processes in a variety of stringent clustering experiments
© 2013 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/3.0/, which permits unrestricted use, provided the original author and source are credited. The binarization of consensus partition matrices (Bi-CoPaM) method has, among its unique features, the ability to perform ensemble clustering over the same set of genes from multiple microarray datasets by using various clustering methods in order to generate tunable tight clusters. Therefore, we have applied the Bi-CoPaM method to the 500 most synchronized cell-cycle-regulated yeast genes from different microarray datasets to produce four tight, specific and exclusive clusters of co-expressed genes. We found that 19 genes formed the tightest of the four clusters, and that this cluster included the gene CMR1/YDL156W, which was an uncharacterized gene at the time of our investigations. Two very recent proteomic and biochemical studies have independently revealed many facets of the CMR1 protein, although the precise functions of the protein remain to be elucidated. Our computational results complement these biological results and add more evidence to their recent findings of CMR1 as potentially participating in many DNA-metabolism processes such as replication, repair and transcription. Interestingly, our results demonstrate the close co-expression of CMR1 with the replication protein A (RPA), the cohesion complex and the DNA polymerases α, δ and ɛ, and suggest functional relationships between CMR1 and the respective proteins. In addition, the analysis provides further substantial evidence that the expression of the CMR1 gene could be regulated by the MBF complex.
In summary, the application of a novel analytic technique in large biological datasets has provided supporting evidence for a gene of previously unknown function, further hypotheses to test, and a more general demonstration of the value of sophisticated methods to explore new large datasets now so readily generated in biological experiments. National Institute for Health Researc
From Multiple Independent Metrics to Single Performance Measure Based on Objective Function
Copyright © The Author 2023. It is extremely common in engineering to design algorithms to perform various tasks. In data-driven decision making in any field, one needs to ascertain the quality of an algorithm. Therefore, a robust assessment of algorithms is essential both for deciding the best algorithm and for improving algorithms. How to perform such an assessment objectively is obvious in the case of a single performance metric, but it is unclear in the case of multiple metrics. Nonetheless, the F1 measure is widely used in cases with two metrics; the F1 measure is the harmonic mean (HM) of the two metrics. Of course, there are other means, e.g., the arithmetic mean (AM) and the geometric mean (GM). As the motivations for using them are intuitive and none of them is based on any objective function, it is difficult to judge objectively which is the best one. In this paper, the single metric case is examined to develop two objective functions that are applicable to any number of metrics. These two objective functions lead to two different performance measures: the distance from the origin (DO) and the distance from the ideal position (DIP). The paper introduces a new concept, the remaining phase space, for evaluating the quality of a performance measure. On further and closer examination of the original goal and the phase space of the metrics, either HM or DIP is found to be the best of these five measures. Specifically, it is found that HM is the best measure at the lower performance end, while DIP is clearly the best measure at the higher performance end, which is of much practical interest. Rules for deciding the best algorithm and the order of a set of algorithms are presented. These results are derived in the context of multiple independent and bounded metrics.
Furthermore, several properties and detailed discussions are provided, following which some published results are reviewed in the present context to elucidate some points. 10.13039/501100001809-NSFC, China, through “111 Project” (Grant Number: B20038
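The five measures named in this abstract can be compared on a small example. The sketch below is illustrative only: the abstract does not give formulas for DO and DIP, so both are assumed here to be Euclidean distances divided by the square root of the number of metrics (to stay in [0, 1]), with the ideal position assumed to be the point where every metric equals 1. The function names are hypothetical, and the paper's own definitions may differ.

```python
import math


def arithmetic_mean(metrics):
    """AM of metrics that each lie in [0, 1]."""
    return sum(metrics) / len(metrics)


def geometric_mean(metrics):
    """GM of metrics that each lie in [0, 1]."""
    return math.prod(metrics) ** (1.0 / len(metrics))


def harmonic_mean(metrics):
    """HM; for two metrics (precision, recall) this is the F1 measure."""
    if any(m == 0 for m in metrics):
        return 0.0  # limiting value when any metric is zero
    return len(metrics) / sum(1.0 / m for m in metrics)


def distance_from_origin(metrics):
    """DO (assumed form): normalised Euclidean distance from (0, ..., 0).

    Higher is better under this assumed definition.
    """
    return math.sqrt(sum(m * m for m in metrics) / len(metrics))


def distance_from_ideal(metrics):
    """DIP (assumed form): normalised Euclidean distance from (1, ..., 1).

    Lower is better under this assumed definition.
    """
    return math.sqrt(sum((1.0 - m) ** 2 for m in metrics) / len(metrics))


if __name__ == "__main__":
    precision_recall = [0.9, 0.6]  # made-up values for illustration
    for measure in (arithmetic_mean, geometric_mean, harmonic_mean,
                    distance_from_origin, distance_from_ideal):
        print(f"{measure.__name__}: {measure(precision_recall):.4f}")
```

For any set of positive metrics the three means satisfy HM ≤ GM ≤ AM, so they can rank the same pair of algorithms differently when the metrics are unbalanced, which is the kind of ambiguity the abstract sets out to resolve.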
Convolutional-Transformer Model with Long-Range Temporal Dependencies for Bearing Fault Diagnosis Using Vibration Signals
Data Availability Statement: The data presented in the first case study may be available on request from the first author, Hosameldin O. A. Ahmed. Copyright © 2023 by the authors. Fault diagnosis of bearings in rotating machinery is a critical task. Vibration signals are a valuable source of information, but they can be complex and noisy. A transformer model can capture distant relationships, which makes it a promising solution for fault diagnosis. However, its application in this field has been limited. This study aims to contribute to this growing area of research by proposing a novel deep-learning architecture that combines the strengths of CNNs and transformer models for effective fault diagnosis in rotating machinery, so that it captures both local and long-range temporal dependencies in the vibration signals. The architecture starts with CNN-based feature extraction, followed by temporal relationship modelling using the transformer. The transformed features are used for classification. Experimental evaluations are conducted on two datasets with six and ten health conditions. In both case studies, the proposed model achieves high accuracy, precision, recall, F1-score, and specificity, all above 99%, using different training dataset sizes. The results demonstrate the effectiveness of the proposed method in diagnosing bearing faults. The convolutional-transformer model proves to be a promising approach for bearing fault diagnosis. The method shows great potential for improving the accuracy and efficiency of fault diagnosis in rotating machinery. This research received no external funding
Intrinsic dimension estimation-based feature selection and multinomial logistic regression for classification of bearing faults using compressively sampled vibration signals
Acknowledgements: The authors wish to thank Brunel University London for their support. Data Availability Statement: The data presented in the first case study may be available on request from the first author, Hosameldin O. A. Ahmed. Copyright: © 2022 by the authors. As failures of rolling bearings lead to major failures in rotating machines, recent vibration-based rolling bearing fault diagnosis techniques have focused on obtaining useful fault features from the huge collection of raw data. However, too many features reduce the classification accuracy and increase the computation time. This paper proposes an effective feature selection technique based on intrinsic dimension estimation of compressively sampled vibration signals. First, compressive sampling (CS) is used to get compressed measurements from the collected raw vibration signals. Then, a global dimension estimator, the geodesic minimal spanning tree (GMST), is employed to compute the minimal number of features needed to represent the compressively sampled signals efficiently. Finally, a feature selection process, combining stochastic proximity embedding (SPE) and neighbourhood component analysis (NCA), is used to select fewer features for bearing fault diagnosis. With a regression analysis-based predictive modelling technique and the multinomial logistic regression (MLR) classifier, the selected features are assessed in two case studies of rolling bearing vibration signals under different working loads. The experimental results demonstrate that the proposed method can successfully select fewer features, with which the MLR-based trained model achieves high classification accuracy and significantly reduced computation times compared to published research. This research received no external funding
Modulation classification in MIMO fading channels via expectation maximization with non-data-aided initialization
Three-stage Hybrid Fault Diagnosis for Rolling Bearings with Compressively-sampled data and Subspace Learning Techniques
To avoid the burden of large storage requirements and long processing times, this paper proposes a three-stage hybrid method, Compressive Sampling with Correlated Principal and Discriminant Components (CS-CPDC), for bearing fault diagnosis based on compressed measurements. In the first stage, Compressive Sampling (CS) is utilised to obtain compressively-sampled signals from raw vibration data. In the second stage, an effective multi-step feature learning algorithm obtains fewer features from correlated principal and discriminant attributes of the compressively-sampled signals, which are then concatenated to increase the performance. In the third stage, with these concatenated features, a Multi-class Support Vector Machine (SVM) is used to train, validate, and classify bearing faults. Results show that the proposed method, CS-CPDC, offers high classification accuracy with reduced computation time and storage requirements, using fewer measurements. National Science Foundation of China; National Science Foundation of Shanghai
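The first stage described in this abstract, compressive sampling, amounts to projecting a length-n signal onto m < n random measurement vectors. The sketch below illustrates only that stage. It assumes a random Gaussian measurement matrix, which is a common choice but is not stated in the abstract, and the function names and synthetic signal are hypothetical. The later feature-learning and SVM stages are not shown.

```python
import math
import random


def gaussian_measurement_matrix(m, n, seed=0):
    """Build an m-by-n random Gaussian matrix (an assumed choice of matrix).

    Entries are scaled by 1/sqrt(m), a common convention in compressive sampling.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def compress(signal, phi):
    """Return y = phi @ x: m compressed measurements from an n-sample signal."""
    if any(len(row) != len(signal) for row in phi):
        raise ValueError("matrix width must match signal length")
    return [sum(p * x for p, x in zip(row, signal)) for row in phi]


if __name__ == "__main__":
    n, m = 64, 16  # keep a quarter as many measurements as raw samples
    # Synthetic stand-in for a raw vibration segment (not real data).
    raw = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
    phi = gaussian_measurement_matrix(m, n)
    measurements = compress(raw, phi)
    print(len(raw), "samples ->", len(measurements), "measurements")
```

Only the m measurements need to be stored or passed on to feature learning, which is where the reduction in storage and processing time mentioned above would come from.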
Vibration Image Representations for Fault Diagnosis of Rotating Machines: A Review
Data Availability Statement: The vibration data used to produce some of the figures may be available on request from the first author, H.O.A.A. Copyright: © 2022 by the authors. Rotating machine vibration signals typically represent a large collection of responses from various sources in a machine, along with some background noise. This makes it challenging to precisely utilise the collected vibration signals for machine fault diagnosis. Much of the research in this area has focused on computing certain features of the original vibration signal in the time domain, frequency domain, and time–frequency domain, which can sufficiently describe the signal in essence. Yet, computing useful features from noisy fault signals, including measurement errors, needs expert prior knowledge and human labour. The past two decades have seen rapid developments in the application of feature-learning or representation-learning techniques that can automatically learn representations of time series vibration datasets to address this problem. These include supervised learning techniques with known data classes and unsupervised learning or clustering techniques with data classes or class boundaries that are not obtainable. More recent developments in the field of computer vision have led to a renewed interest in transforming the 1D time series vibration signal into a 2D image, which can often offer discriminative descriptions of vibration signals. Several forms of features can be learned from the vibration images, including shape, colour, texture, pixel intensity, etc. Given its high performance in fault diagnosis, the image representation of vibration signals is receiving growing attention from researchers. In this paper, we review the works associated with vibration image representation-based fault detection and diagnosis for rotating machines in order to chart the progress in this field.
We present the first comprehensive survey of this topic by summarising and categorising existing vibration image representation techniques based on their characteristics and the processing domain of the vibration signal. In addition, we also analyse the application of these techniques in rotating machine fault detection and classification. Finally, we briefly outline future research directions based on the reviewed works. This research received no external funding
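One of the simplest ways to turn a 1D time series into a 2D image, as this abstract describes in general terms, is to cut the signal into equal-length rows and rescale the amplitudes to 8-bit grey levels. The sketch below shows that time-domain reshaping only. It is an assumed, minimal variant, not one of the specific techniques the review categorises, and `signal_to_image` is a hypothetical name.

```python
import math


def signal_to_image(signal, width):
    """Reshape a 1D signal into rows of `width` grey-level pixels (0..255).

    Samples that do not fill a complete final row are discarded.
    """
    height = len(signal) // width
    if height == 0:
        raise ValueError("signal is shorter than one image row")
    used = signal[: height * width]
    lo, hi = min(used), max(used)
    span = (hi - lo) or 1.0  # avoid dividing by zero for a constant signal
    pixels = [round(255 * (x - lo) / span) for x in used]
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


if __name__ == "__main__":
    # Synthetic stand-in for a vibration segment (not real data).
    segment = [math.sin(2 * math.pi * 3 * t / 256) for t in range(256)]
    image = signal_to_image(segment, width=16)
    print(len(image), "rows x", len(image[0]), "columns")
```

The resulting grid can be handed to any image-based feature extractor; the row width sets how much of the signal's periodicity shows up as visible texture.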